Overview

Dataset statistics

Number of variables: 9
Number of observations: 3192
Missing cells: 7224
Missing cells (%): 25.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 224.6 KiB
Average record size in memory: 72.0 B
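
The missing-cell figures can be cross-checked against the per-variable missing counts listed in the Variables section of this report. The following is a minimal sketch that uses only numbers appearing in the report; the dictionary and variable names are illustrative and not part of the profiling tool.

```python
# Per-variable missing counts, copied from the Variables section of this report.
missing_per_variable = {
    "dt": 0,
    "LandAverageTemperature": 12,
    "LandAverageTemperatureUncertainty": 12,
    "LandMaxTemperature": 1200,
    "LandMaxTemperatureUncertainty": 1200,
    "LandMinTemperature": 1200,
    "LandMinTemperatureUncertainty": 1200,
    "LandAndOceanAverageTemperature": 1200,
    "LandAndOceanAverageTemperatureUncertainty": 1200,
}

n_variables = len(missing_per_variable)  # 9 variables
n_observations = 3192                    # rows in the dataset

missing_cells = sum(missing_per_variable.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")  # prints: 7224 25.1%
```

Six of the nine variables each lack 1200 values, which accounts for nearly all of the missing cells.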

Variable types

NUM: 8
CAT: 1

Warnings

LandMaxTemperature is highly correlated with LandAverageTemperature and 2 other fields [High correlation]
LandAverageTemperature is highly correlated with LandMaxTemperature and 2 other fields [High correlation]
LandMinTemperature is highly correlated with LandAverageTemperature and 2 other fields [High correlation]
LandAndOceanAverageTemperature is highly correlated with LandAverageTemperature and 2 other fields [High correlation]
LandAndOceanAverageTemperatureUncertainty is highly correlated with LandAverageTemperatureUncertainty [High correlation]
LandAverageTemperatureUncertainty is highly correlated with LandAndOceanAverageTemperatureUncertainty [High correlation]
LandMaxTemperature has 1200 (37.6%) missing values [Missing]
LandMaxTemperatureUncertainty has 1200 (37.6%) missing values [Missing]
LandMinTemperature has 1200 (37.6%) missing values [Missing]
LandMinTemperatureUncertainty has 1200 (37.6%) missing values [Missing]
LandAndOceanAverageTemperature has 1200 (37.6%) missing values [Missing]
LandAndOceanAverageTemperatureUncertainty has 1200 (37.6%) missing values [Missing]
dt has unique values [Unique]

Reproduction

Analysis started: 2022-09-18 17:53:31.617489
Analysis finished: 2022-09-18 17:53:55.341142
Duration: 23.72 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

dt
Categorical

UNIQUE

Distinct: 3192
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 24.9 KiB

Value | Count
1750-01-01 | 1
1926-09-01 | 1
1926-11-01 | 1
1926-12-01 | 1
1927-01-01 | 1
Other values (3187) | 3187
Value | Count | Frequency (%)
1750-01-01 | 1 | < 0.1%
1926-09-01 | 1 | < 0.1%
1926-11-01 | 1 | < 0.1%
1926-12-01 | 1 | < 0.1%
1927-01-01 | 1 | < 0.1%
1927-02-01 | 1 | < 0.1%
1927-03-01 | 1 | < 0.1%
1927-04-01 | 1 | < 0.1%
1927-05-01 | 1 | < 0.1%
1927-06-01 | 1 | < 0.1%
Other values (3182) | 3182 | 99.7%

[Figure: Frequencies of value counts]

Unique

Unique: 3192
Unique (%): 100.0%

[Figure: Histogram of lengths of the category]

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

LandAverageTemperature
Real number (ℝ)

HIGH CORRELATION

Distinct: 2839
Distinct (%): 89.3%
Missing: 12
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.374731132
Minimum: -2.08
Maximum: 19.021
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: -2.08
5-th percentile: 1.97075
Q1: 4.312
Median: 8.6105
Q3: 12.54825
95-th percentile: 14.395
Maximum: 19.021
Range: 21.101
Interquartile range (IQR): 8.23625

Descriptive statistics

Standard deviation: 4.381309771
Coefficient of variation (CV): 0.5231582604
Kurtosis: -1.342072459
Mean: 8.374731132
Median Absolute Deviation (MAD): 4.1565
Skewness: -0.08142566548
Sum: 26631.645
Variance: 19.19587531
Monotonicity: Not monotonic
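
Several of the statistics above are simple functions of others in the same tables: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported LandAverageTemperature values as a consistency check. It does not touch the underlying data, and the variable names are illustrative.

```python
# Values copied from the LandAverageTemperature tables in this report.
minimum, maximum = -2.08, 19.021
q1, q3 = 4.312, 12.54825
mean = 8.374731132
std = 4.381309771

value_range = maximum - minimum  # reported Range: 21.101
iqr = q3 - q1                    # reported IQR: 8.23625
cv = std / mean                  # reported CV: 0.5231582604
variance = std ** 2              # reported Variance: 19.19587531

print(value_range, iqr, cv, variance)
```

Up to floating-point rounding, the recomputed values should agree with the reported ones. The same relations can be checked for the other numeric variables.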
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
13.765 | 4 | 0.1%
13.293 | 4 | 0.1%
2.039 | 3 | 0.1%
11.097 | 3 | 0.1%
14.242 | 3 | 0.1%
12.247 | 3 | 0.1%
2.737 | 3 | 0.1%
14.742 | 3 | 0.1%
3.099 | 3 | 0.1%
3.92 | 3 | 0.1%
Other values (2829) | 3148 | 98.6%
(Missing) | 12 | 0.4%

Minimum 5 values
Value | Count | Frequency (%)
-2.08 | 1 | < 0.1%
-1.503 | 1 | < 0.1%
-1.431 | 1 | < 0.1%
-1.385 | 1 | < 0.1%
-1.249 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
19.021 | 1 | < 0.1%
17.91 | 1 | < 0.1%
17.61 | 1 | < 0.1%
17.115 | 1 | < 0.1%
16.821 | 1 | < 0.1%

LandAverageTemperatureUncertainty
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1594
Distinct (%): 50.1%
Missing: 12
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9384679245
Minimum: 0.034
Maximum: 7.88
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 0.034
5-th percentile: 0.066
Q1: 0.18675
Median: 0.392
Q3: 1.41925
95-th percentile: 3.2351
Maximum: 7.88
Range: 7.846
Interquartile range (IQR): 1.2325

Descriptive statistics

Standard deviation: 1.096439795
Coefficient of variation (CV): 1.168329536
Kurtosis: 3.536050467
Mean: 0.9384679245
Median Absolute Deviation (MAD): 0.31
Skewness: 1.780596521
Sum: 2984.328
Variance: 1.202180224
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0.087 | 20 | 0.6%
0.064 | 19 | 0.6%
0.077 | 16 | 0.5%
0.078 | 16 | 0.5%
0.068 | 14 | 0.4%
0.086 | 14 | 0.4%
0.082 | 14 | 0.4%
0.085 | 13 | 0.4%
0.084 | 13 | 0.4%
0.07 | 12 | 0.4%
Other values (1584) | 3029 | 94.9%

Minimum 5 values
Value | Count | Frequency (%)
0.034 | 1 | < 0.1%
0.035 | 2 | 0.1%
0.036 | 1 | < 0.1%
0.039 | 1 | < 0.1%
0.04 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
7.88 | 1 | < 0.1%
7.492 | 1 | < 0.1%
7.349 | 1 | < 0.1%
6.415 | 1 | < 0.1%
6.341 | 1 | < 0.1%

LandMaxTemperature
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1814
Distinct (%): 91.1%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.3506009
Minimum: 5.9
Maximum: 21.32
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 5.9
5-th percentile: 8.118
Q1: 10.212
Median: 14.76
Q3: 18.4515
95-th percentile: 20.1625
Maximum: 21.32
Range: 15.42
Interquartile range (IQR): 8.2395

Descriptive statistics

Standard deviation: 4.309578966
Coefficient of variation (CV): 0.3003065164
Kurtosis: -1.456171165
Mean: 14.3506009
Median Absolute Deviation (MAD): 4.137
Skewness: -0.09693800875
Sum: 28586.397
Variance: 18.57247086
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
10.781 | 3 | 0.1%
8.555 | 3 | 0.1%
19.753 | 3 | 0.1%
11.411 | 3 | 0.1%
17.713 | 3 | 0.1%
20.037 | 3 | 0.1%
17.196 | 3 | 0.1%
19.36 | 3 | 0.1%
19.85 | 3 | 0.1%
19.987 | 3 | 0.1%
Other values (1804) | 1962 | 61.5%
(Missing) | 1200 | 37.6%

Minimum 5 values
Value | Count | Frequency (%)
5.9 | 1 | < 0.1%
6.421 | 1 | < 0.1%
6.436 | 1 | < 0.1%
6.642 | 1 | < 0.1%
6.679 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
21.32 | 1 | < 0.1%
21.199 | 1 | < 0.1%
21.108 | 1 | < 0.1%
21.085 | 1 | < 0.1%
21.006 | 2 | 0.1%

LandMaxTemperatureUncertainty
Real number (ℝ≥0)

MISSING

Distinct: 841
Distinct (%): 42.2%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4797816265
Minimum: 0.044
Maximum: 4.373
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 0.044
5-th percentile: 0.083
Q1: 0.142
Median: 0.252
Q3: 0.539
95-th percentile: 1.86245
Maximum: 4.373
Range: 4.329
Interquartile range (IQR): 0.397

Descriptive statistics

Standard deviation: 0.5832029575
Coefficient of variation (CV): 1.215559174
Kurtosis: 7.55348287
Mean: 0.4797816265
Median Absolute Deviation (MAD): 0.136
Skewness: 2.565863891
Sum: 955.725
Variance: 0.3401256896
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0.093 | 14 | 0.4%
0.105 | 12 | 0.4%
0.098 | 11 | 0.3%
0.094 | 11 | 0.3%
0.106 | 11 | 0.3%
0.13 | 11 | 0.3%
0.16 | 10 | 0.3%
0.099 | 10 | 0.3%
0.179 | 10 | 0.3%
0.09 | 9 | 0.3%
Other values (831) | 1883 | 59.0%
(Missing) | 1200 | 37.6%

Minimum 5 values
Value | Count | Frequency (%)
0.044 | 1 | < 0.1%
0.048 | 1 | < 0.1%
0.052 | 1 | < 0.1%
0.055 | 3 | 0.1%
0.056 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
4.373 | 1 | < 0.1%
4.24 | 1 | < 0.1%
4.164 | 1 | < 0.1%
3.751 | 1 | < 0.1%
3.491 | 1 | < 0.1%

LandMinTemperature
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 1873
Distinct (%): 94.0%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.743595382
Minimum: -5.407
Maximum: 9.715
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: -5.407
5-th percentile: -3.32445
Q1: -1.3345
Median: 2.9495
Q3: 6.77875
95-th percentile: 8.51045
Maximum: 9.715
Range: 15.122
Interquartile range (IQR): 8.11325

Descriptive statistics

Standard deviation: 4.15583532
Coefficient of variation (CV): 1.514740602
Kurtosis: -1.433529954
Mean: 2.743595382
Median Absolute Deviation (MAD): 4.088
Skewness: -0.05025501431
Sum: 5465.242
Variance: 17.27096721
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
8.161 | 3 | 0.1%
7.818 | 3 | 0.1%
-1.139 | 3 | 0.1%
7.892 | 3 | 0.1%
8.184 | 3 | 0.1%
3.025 | 2 | 0.1%
-2.52 | 2 | 0.1%
-1.162 | 2 | 0.1%
-1.43 | 2 | 0.1%
0.629 | 2 | 0.1%
Other values (1863) | 1967 | 61.6%
(Missing) | 1200 | 37.6%

Minimum 5 values
Value | Count | Frequency (%)
-5.407 | 1 | < 0.1%
-5.345 | 1 | < 0.1%
-4.947 | 1 | < 0.1%
-4.717 | 1 | < 0.1%
-4.678 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
9.715 | 1 | < 0.1%
9.684 | 1 | < 0.1%
9.569 | 1 | < 0.1%
9.551 | 1 | < 0.1%
9.482 | 1 | < 0.1%

LandMinTemperatureUncertainty
Real number (ℝ≥0)

MISSING

Distinct: 781
Distinct (%): 39.2%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4318488956
Minimum: 0.045
Maximum: 3.498
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 0.045
5-th percentile: 0.08455
Q1: 0.155
Median: 0.279
Q3: 0.45825
95-th percentile: 1.3948
Maximum: 3.498
Range: 3.453
Interquartile range (IQR): 0.30325

Descriptive statistics

Standard deviation: 0.4458378371
Coefficient of variation (CV): 1.032393139
Kurtosis: 7.0548683
Mean: 0.4318488956
Median Absolute Deviation (MAD): 0.135
Skewness: 2.384389692
Sum: 860.243
Variance: 0.198771377
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0.237 | 12 | 0.4%
0.082 | 11 | 0.3%
0.145 | 11 | 0.3%
0.13 | 11 | 0.3%
0.126 | 11 | 0.3%
0.338 | 10 | 0.3%
0.224 | 10 | 0.3%
0.213 | 10 | 0.3%
0.127 | 9 | 0.3%
0.125 | 9 | 0.3%
Other values (771) | 1888 | 59.1%
(Missing) | 1200 | 37.6%

Minimum 5 values
Value | Count | Frequency (%)
0.045 | 1 | < 0.1%
0.047 | 1 | < 0.1%
0.051 | 3 | 0.1%
0.053 | 1 | < 0.1%
0.054 | 2 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
3.498 | 1 | < 0.1%
3.428 | 1 | < 0.1%
2.963 | 1 | < 0.1%
2.929 | 1 | < 0.1%
2.843 | 1 | < 0.1%

LandAndOceanAverageTemperature
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1596
Distinct (%): 80.1%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.21256576
Minimum: 12.475
Maximum: 17.611
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 12.475
5-th percentile: 13.3
Q1: 14.047
Median: 15.251
Q3: 16.39625
95-th percentile: 17.0166
Maximum: 17.611
Range: 5.136
Interquartile range (IQR): 2.34925

Descriptive statistics

Standard deviation: 1.274092954
Coefficient of variation (CV): 0.08375266699
Kurtosis: -1.322464368
Mean: 15.21256576
Median Absolute Deviation (MAD): 1.179
Skewness: -0.05604937795
Sum: 30303.431
Variance: 1.623312857
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
15.005 | 5 | 0.2%
16.596 | 4 | 0.1%
13.26 | 4 | 0.1%
13.311 | 4 | 0.1%
15.927 | 4 | 0.1%
16.496 | 4 | 0.1%
16.846 | 4 | 0.1%
16.783 | 4 | 0.1%
13.704 | 3 | 0.1%
13.54 | 3 | 0.1%
Other values (1586) | 1953 | 61.2%
(Missing) | 1200 | 37.6%

Minimum 5 values
Value | Count | Frequency (%)
12.475 | 1 | < 0.1%
12.62 | 1 | < 0.1%
12.658 | 1 | < 0.1%
12.702 | 1 | < 0.1%
12.732 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
17.611 | 1 | < 0.1%
17.609 | 1 | < 0.1%
17.607 | 1 | < 0.1%
17.589 | 1 | < 0.1%
17.578 | 1 | < 0.1%

LandAndOceanAverageTemperatureUncertainty
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 294
Distinct (%): 14.8%
Missing: 1200
Missing (%): 37.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1285321285
Minimum: 0.042
Maximum: 0.457
Zeros: 0
Zeros (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 0.042
5-th percentile: 0.052
Q1: 0.063
Median: 0.122
Q3: 0.151
95-th percentile: 0.28345
Maximum: 0.457
Range: 0.415
Interquartile range (IQR): 0.088

Descriptive statistics

Standard deviation: 0.07358679601
Coefficient of variation (CV): 0.5725167462
Kurtosis: 1.525069706
Mean: 0.1285321285
Median Absolute Deviation (MAD): 0.0535
Skewness: 1.275594309
Sum: 256.036
Variance: 0.005415016546
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
0.061 | 49 | 1.5%
0.059 | 47 | 1.5%
0.06 | 45 | 1.4%
0.062 | 44 | 1.4%
0.057 | 41 | 1.3%
0.058 | 39 | 1.2%
0.056 | 37 | 1.2%
0.054 | 35 | 1.1%
0.063 | 35 | 1.1%
0.064 | 31 | 1.0%
Other values (284) | 1589 | 49.8%
(Missing) | 1200 | 37.6%

Minimum 5 values
Value | Count | Frequency (%)
0.042 | 1 | < 0.1%
0.043 | 1 | < 0.1%
0.045 | 3 | 0.1%
0.046 | 3 | 0.1%
0.047 | 4 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
0.457 | 1 | < 0.1%
0.442 | 1 | < 0.1%
0.438 | 1 | < 0.1%
0.427 | 1 | < 0.1%
0.417 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots of the 8 numeric variables (64 panels)]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
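
The calculation just described can be sketched in plain Python. This illustrates the formula only and is not the code the profiling tool runs. The function name `pearson_r` and the sample data are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Illustrative data, not taken from the dataset.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(xs, ys)
# Shifting and rescaling (with positive factors) leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])
print(round(r, 4), round(r_shifted, 4))  # prints: 0.7746 0.7746
```

The function divides by zero if either variable is constant. A production implementation would need to handle that case and missing values.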

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
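
As a rough sketch of this definition, ρ can be computed by ranking each variable and applying the Pearson formula to the ranks. The version below gives tied values their average rank, which is a common convention but an assumption here. All function names and the sample data are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (common 1/n factors cancel)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Illustrative data: ys increases monotonically with xs, but not linearly.
xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this example ρ is 1 up to floating-point rounding, while Pearson's r is noticeably below 1. This is the sense in which ρ captures nonlinear monotonic relationships.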

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
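
The pair-counting definition translates directly into a short Python sketch. This is the plain "tau-a" form of the formula as stated in the text. Statistical libraries often use a tie-corrected variant (tau-b), and the report does not say which variant produced its matrix, so treat this only as an illustration. The function name and sample data are made up.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant,
    but they still contribute to the denominator.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs

# Illustrative data: 6 pairs in total, 5 concordant and 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))  # prints: 0.6667
```

The double loop over pairs is quadratic in the number of observations, which is fine for a demonstration but slow for long series.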

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: missing-values plots (4 images)]

Sample

First rows

index | dt | LandAverageTemperature | LandAverageTemperatureUncertainty | LandMaxTemperature | LandMaxTemperatureUncertainty | LandMinTemperature | LandMinTemperatureUncertainty | LandAndOceanAverageTemperature | LandAndOceanAverageTemperatureUncertainty
0 | 1750-01-01 | 3.034 | 3.574 | NaN | NaN | NaN | NaN | NaN | NaN
1 | 1750-02-01 | 3.083 | 3.702 | NaN | NaN | NaN | NaN | NaN | NaN
2 | 1750-03-01 | 5.626 | 3.076 | NaN | NaN | NaN | NaN | NaN | NaN
3 | 1750-04-01 | 8.490 | 2.451 | NaN | NaN | NaN | NaN | NaN | NaN
4 | 1750-05-01 | 11.573 | 2.072 | NaN | NaN | NaN | NaN | NaN | NaN
5 | 1750-06-01 | 12.937 | 1.724 | NaN | NaN | NaN | NaN | NaN | NaN
6 | 1750-07-01 | 15.868 | 1.911 | NaN | NaN | NaN | NaN | NaN | NaN
7 | 1750-08-01 | 14.750 | 2.231 | NaN | NaN | NaN | NaN | NaN | NaN
8 | 1750-09-01 | 11.413 | 2.637 | NaN | NaN | NaN | NaN | NaN | NaN
9 | 1750-10-01 | 6.367 | 2.668 | NaN | NaN | NaN | NaN | NaN | NaN

Last rows

index | dt | LandAverageTemperature | LandAverageTemperatureUncertainty | LandMaxTemperature | LandMaxTemperatureUncertainty | LandMinTemperature | LandMinTemperatureUncertainty | LandAndOceanAverageTemperature | LandAndOceanAverageTemperatureUncertainty
3182 | 2015-03-01 | 6.740 | 0.060 | 12.659 | 0.096 | 0.894 | 0.079 | 15.193 | 0.061
3183 | 2015-04-01 | 9.313 | 0.088 | 15.224 | 0.137 | 3.402 | 0.147 | 15.962 | 0.061
3184 | 2015-05-01 | 12.312 | 0.081 | 18.181 | 0.117 | 6.313 | 0.153 | 16.774 | 0.058
3185 | 2015-06-01 | 14.505 | 0.068 | 20.364 | 0.133 | 8.627 | 0.168 | 17.390 | 0.057
3186 | 2015-07-01 | 15.051 | 0.086 | 20.904 | 0.109 | 9.326 | 0.225 | 17.611 | 0.058
3187 | 2015-08-01 | 14.755 | 0.072 | 20.699 | 0.110 | 9.005 | 0.170 | 17.589 | 0.057
3188 | 2015-09-01 | 12.999 | 0.079 | 18.845 | 0.088 | 7.199 | 0.229 | 17.049 | 0.058
3189 | 2015-10-01 | 10.801 | 0.102 | 16.450 | 0.059 | 5.232 | 0.115 | 16.290 | 0.062
3190 | 2015-11-01 | 7.433 | 0.119 | 12.892 | 0.093 | 2.157 | 0.106 | 15.252 | 0.063
3191 | 2015-12-01 | 5.518 | 0.100 | 10.725 | 0.154 | 0.287 | 0.099 | 14.774 | 0.062